Abstract

A class of multiscale stochastic models based on scale-recursive dynamics on trees has recently
been introduced. These models are interesting because they can be used to represent a broad
class of physical phenomena and because they lead to efficient algorithms for estimation and
likelihood calculation. In this paper, we provide a complete statistical characterization of the
error associated with smoothed estimates of the multiscale stochastic processes described by
these models. In particular, we show that the smoothing error is itself a multiscale stochastic
process with parameters that can be explicitly calculated.
1  Introduction


A class of multiscale models describing stochastic processes indexed by the nodes of a tree has
recently been introduced in [1, 2]. These models can be used to capture a surprisingly rich class
of physical phenomena. For instance, experimental results in [2] illustrate that they can be used
to model the statistical self-similarity exhibited by stochastic processes with generalized power
spectra of the form $1/f^{\beta}$, and in [3] we describe how they can be used to represent any 1-D Markov
process or 2-D Markov random field. Moreover, this class of models leads to efficient algorithms
for estimation and likelihood calculation and as a result provides a useful framework for a variety
of signal and image processing problems [1, 2, 4, 5, 6].

Knowledge of the error statistics of smoothed estimates of such processes is essential for the
development of a number of important new applications, including, for instance, so-called mapping
problems [7], the multiscale counterpart to the model validation problem in [8], and certain
oceanographic problems [9]. Several such applications have been developed in the context of 1-D
Gauss-Markov models by exploiting relatively recent results which show that the smoothing error
processes associated with Gauss-Markov models are themselves Gauss-Markov processes [7, 8, 10, 11].^1 In
this paper, we derive a dynamic model for the smoothing error process associated with multiscale
stochastic models. In particular, we show that the smoothing error is itself a multiscale stochastic
process with parameters that can be explicitly computed. These results generalize previous
results for Gauss-Markov processes, since these processes correspond to a degenerate form of the
multiscale models, and provide the necessary framework for applications such as those mentioned
above.

This paper is organized as follows. In Section 2 we briefly review the class of multiscale stochastic
models of interest here and the scale-recursive estimation algorithm associated with them. In
Section 3 we derive a multiscale model for the smoothing error process.

^1 More generally, Levy et al. [12] have recently shown that the smoothing error processes associated with the
class of Gaussian reciprocal processes, which contains the class of Gauss-Markov processes, are themselves Gaussian
reciprocal. See also [13] for similar results corresponding to 2-D Gauss-Markov random fields.

2  Multiscale  Stochastic  Modeling  and  Optimal  Estimation 

The models presented in this section describe multiscale Gaussian stochastic processes indexed by
nodes on a tree. A $q$th-order tree is a pyramidal structure of nodes connected such that each node
of the tree has $q$ offspring (see Figure 1). We denote nodes on the tree with an abstract index $s$,
and define an upward (fine-to-coarse) shift operator $\bar{\gamma}$ such that $s\bar{\gamma}$ is the parent of node $s$. We also
define a corresponding set of downward shift operators $\alpha_1, \ldots, \alpha_q$ such that $s\alpha_1, \ldots, s\alpha_q$ are the
offspring of node $s$. In addition, we denote the set of nodes on the tree as $\mathcal{T}$ and the set of nodes
which includes node $s$ and all of its descendants as $\mathcal{T}_s$, i.e. $\mathcal{T}_s = \{\sigma \mid \sigma = s \text{ or } \sigma \text{ is a descendant of } s\}$.
Also, the complement of $\mathcal{T}_s$ is denoted $\mathcal{T}_s^c$. The statistical characterization of the model state $x(s) \in \mathbb{R}^n$
is then given by:

$$x(s) = A(s)\,x(s\bar{\gamma}) + B(s)\,w(s) \qquad (1)$$

under the assumptions that $x(0) \sim \mathcal{N}(0, P(0))$, $w(s) \sim \mathcal{N}(0, I)$, $A(s)$ and $B(s)$ are matrices of
appropriate size, and $s = 0$ is the root node at the top of the tree. The driving noise $w(s) \in \mathbb{R}^m$ is
white, i.e. $w(s)$ and $w(\sigma)$ are independent if $s \neq \sigma$, and is independent of the initial condition $x(0)$.
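The recursion (1) can be sampled level by level, from the root down. The following is a minimal sketch under simplifying assumptions that are not part of the model itself: $A$ and $B$ are taken to be the same at every node, and the function name `simulate_tree` and its level-by-level array layout are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_tree(A, B, P0, q, depth, rng):
    """Sample x(s) = A x(parent of s) + B w(s) on a q-ary tree.

    Assumes A and B are constant over the tree (a simplification).
    Returns a list `levels`; levels[m] has shape (q**m, n), and node k
    at level m has parent k // q at level m - 1.
    """
    n = A.shape[0]
    root = rng.multivariate_normal(np.zeros(n), P0)    # x(0) ~ N(0, P(0))
    levels = [root[None, :]]
    for m in range(1, depth + 1):
        parents = np.repeat(levels[-1], q, axis=0)     # each parent repeated q times
        w = rng.standard_normal((q ** m, B.shape[1]))  # white noise, w(s) ~ N(0, I)
        levels.append(parents @ A.T + w @ B.T)
    return levels
```

Setting $B = 0$ removes the driving noise, so every node at level $m$ equals $A^m x(0)$, which is a convenient sanity check of the indexing.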

The class of models (1) has a statistical structure that can be exploited to develop efficient
signal processing algorithms. In particular, note that any given node on the $q$th-order tree can be
viewed as a boundary between $q + 1$ subsets of nodes ($q$ corresponding to paths leading towards
offspring and one corresponding to a path leading towards a parent). An important property of
the model (1) is that, conditioned on the value of the state at any node, the values of the state
in these $q + 1$ subsets of nodes are independent. This fact is the basis
for the development in [1, 2] of an algorithm for computing smoothed estimates of $x(s)$ based on
noisy measurements $y(s) \in \mathbb{R}^p$ of the form:

$$y(s) = C(s)\,x(s) + v(s) \qquad (2)$$

where $v(s) \sim \mathcal{N}(0, R(s))$ is independent of both $w(s)$ and $x(0)$. The algorithm for computing
the smoothed estimates of $x(s)$ is a generalization to $q$th-order trees of the well-known
Rauch-Tung-Striebel algorithm for smoothing 1-D Gauss-Markov processes. We briefly review this algorithm
next, and then derive a general model for the error associated with the smoothed estimates.

We denote the set of states defined at nodes in $\mathcal{T}_s$ as $X_s$, i.e. $X_s = \{x(\sigma)\}_{\sigma \in \mathcal{T}_s}$, and similarly
$Y_s = \{y(\sigma)\}_{\sigma \in \mathcal{T}_s}$. The set of measurements in the subtree strictly below $s$ is denoted $Y_s^{\alpha}$, i.e.
$Y_s^{\alpha} = \{y(\sigma) \mid \sigma \text{ is a descendant of } s\}$. We also define $\hat{x}(s|Y)$ as the expected value of $x(s)$ given
measurements in the set $Y$, and the corresponding error covariance as $P(s|Y)$.

The upward sweep of the smoothing algorithm begins with the initialization of $\hat{x}(s|Y_s^{\alpha})$ and
$P(s|Y_s^{\alpha})$ at the finest level. In particular, for every $s$ at this finest scale we set $\hat{x}(s|Y_s^{\alpha})$ to zero
and $P(s|Y_s^{\alpha})$ to the solution at the finest level of the tree of the Lyapunov equation:

$$P(s) = A(s)\,P(s\bar{\gamma})\,A^T(s) + B(s)\,B^T(s) \qquad (3)$$

where $P(s)$ denotes the covariance of the process $x(s)$ at node $s$. Suppose then that we have
$\hat{x}(s|Y_s^{\alpha})$ and $P(s|Y_s^{\alpha})$ at a given node $s$. This estimate is updated to incorporate the measurement
$y(s)$ according to the following:

$$\hat{x}(s|Y_s) = \hat{x}(s|Y_s^{\alpha}) + K(s)\,[y(s) - C(s)\,\hat{x}(s|Y_s^{\alpha})] \qquad (4)$$

$$P(s|Y_s) = [I - K(s)\,C(s)]\,P(s|Y_s^{\alpha}) \qquad (5)$$

where $K(s) = P(s|Y_s^{\alpha})\,C^T(s)\,[C(s)\,P(s|Y_s^{\alpha})\,C^T(s) + R(s)]^{-1}$.
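The update step (4)-(5) is the familiar Kalman measurement update applied at a single node. A minimal sketch follows, in which the function name `measurement_update` and the argument names are hypothetical; `x_pred` and `P_pred` stand for the estimate and error covariance based on the measurements strictly below $s$.

```python
import numpy as np

def measurement_update(x_pred, P_pred, y, C, R):
    """Incorporate y(s) into the estimate at node s, following (4)-(5)."""
    # Gain K(s) = P C^T [C P C^T + R]^{-1}
    K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)
    x_upd = x_pred + K @ (y - C @ x_pred)                # eq. (4)
    P_upd = (np.eye(P_pred.shape[0]) - K @ C) @ P_pred   # eq. (5)
    return x_upd, P_upd, K
```

In the scalar case with unit prior variance, $C = 1$ and $R = 1$, the gain is $1/2$, so the updated estimate lies halfway between the prediction and the measurement and the error variance is halved.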

Suppose next that we have the updated estimates $\hat{x}(s\alpha_i|Y_{s\alpha_i})$ at all of the immediate descendants
of node $s$. The next step involves the use of these estimates to predict $x(s)$ at the next coarser
scale, i.e. to compute $\hat{x}(s|Y_{s\alpha_i})$. Using the following upward model for the multiscale process [1, 2]:

$$x(s\bar{\gamma}) = F(s)\,x(s) + \bar{w}(s) \qquad (6)$$

with the measurement equation again given by (2), and where $F(s) = P(s\bar{\gamma})\,A^T(s)\,P^{-1}(s)$ and
$E[\bar{w}(s)\,\bar{w}^T(s)] = P(s\bar{\gamma}) - P(s\bar{\gamma})\,A^T(s)\,P^{-1}(s)\,A(s)\,P(s\bar{\gamma}) = Q(s)$, we compute the fine-to-coarse
predicted estimates:

$$\hat{x}(s|Y_{s\alpha_i}) = F(s\alpha_i)\,\hat{x}(s\alpha_i|Y_{s\alpha_i}) \qquad (7)$$

$$P(s|Y_{s\alpha_i}) = F(s\alpha_i)\,P(s\alpha_i|Y_{s\alpha_i})\,F^T(s\alpha_i) + Q(s\alpha_i) \qquad (8)$$
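The upward model (6) and the prediction step (7)-(8) can be sketched as two small helpers. The names `upward_model` and `predict_parent` are hypothetical; `P_parent` and `P_node` denote the prior covariances $P(s\bar{\gamma})$ and $P(s)$ obtained from the Lyapunov equation (3).

```python
import numpy as np

def upward_model(A, P_parent, P_node):
    """Return F(s) and Q(s) of the upward model (6)."""
    F = P_parent @ A.T @ np.linalg.inv(P_node)
    Q = P_parent - F @ A @ P_parent   # equals P_par - P_par A^T P^{-1} A P_par
    return F, Q

def predict_parent(x_upd, P_upd, F, Q):
    """Fine-to-coarse prediction of the parent state, eqs. (7)-(8)."""
    return F @ x_upd, F @ P_upd @ F.T + Q
```

A useful check is that when the offspring carries no measurement information, so that `P_upd` equals the prior `P_node`, the predicted covariance reduces to the parent's prior covariance, since $F(s)P(s)F^T(s) + Q(s) = P(s\bar{\gamma})$.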

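The estimates $\hat{x}(s|Y_{s\alpha_i})$, $i = 1, \ldots, q$, are then merged to obtain

$$\hat{x}(s|Y_s^{\alpha}) = P(s|Y_s^{\alpha}) \sum_{i=1}^{q} P^{-1}(s|Y_{s\alpha_i})\,\hat{x}(s|Y_{s\alpha_i}) \qquad (9)$$

$$P(s|Y_s^{\alpha}) = \Big[(1 - q)\,P^{-1}(s) + \sum_{i=1}^{q} P^{-1}(s|Y_{s\alpha_i})\Big]^{-1} \qquad (10)$$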
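The merge step (9)-(10) combines the $q$ predictions in information form while removing the prior information that would otherwise be counted $q$ times. A minimal sketch, with the hypothetical name `merge_predictions`:

```python
import numpy as np

def merge_predictions(x_preds, P_preds, P_prior):
    """Merge q predictions of x(s), one per offspring, following (9)-(10).

    x_preds, P_preds: lists of predicted estimates and covariances.
    P_prior: prior covariance P(s) from the Lyapunov equation (3).
    """
    q = len(x_preds)
    info = (1 - q) * np.linalg.inv(P_prior)   # prior counted once, not q times
    info_state = np.zeros(P_prior.shape[0])
    for x_i, P_i in zip(x_preds, P_preds):
        P_i_inv = np.linalg.inv(P_i)
        info = info + P_i_inv
        info_state = info_state + P_i_inv @ x_i
    P_merged = np.linalg.inv(info)            # eq. (10)
    return P_merged @ info_state, P_merged    # eq. (9)
```

If every prediction is uninformative, i.e. each `P_i` equals `P_prior`, the merged covariance is again `P_prior`, which illustrates the role of the $(1 - q)P^{-1}(s)$ term.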

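The recursion proceeds up the tree until one obtains the smoothed estimate of the root node,
$\hat{x}(0|Y_0)$. This estimate initializes a downward sweep in which $\hat{x}(s|Y_0)$ is computed according to

$$\hat{x}(s|Y_0) = \hat{x}(s|Y_s) + J(s)\,[\hat{x}(s\bar{\gamma}|Y_0) - \hat{x}(s\bar{\gamma}|Y_s)] \qquad (11)$$

$$P(s|Y_0) = P(s|Y_s) + J(s)\,[P(s\bar{\gamma}|Y_0) - P(s\bar{\gamma}|Y_s)]\,J^T(s) \qquad (12)$$

$$J(s) = P(s|Y_s)\,F^T(s)\,P^{-1}(s\bar{\gamma}|Y_s) \qquad (13)$$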
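One step of the downward sweep (11)-(13) can be sketched as follows. The function name `smooth_step` and its argument names are hypothetical; `x_par_pred` and `P_par_pred` stand for $\hat{x}(s\bar{\gamma}|Y_s)$ and $P(s\bar{\gamma}|Y_s)$, which are available from the prediction step (7)-(8) of the upward sweep.

```python
import numpy as np

def smooth_step(x_upd, P_upd, x_par_pred, P_par_pred,
                x_par_smooth, P_par_smooth, F):
    """Compute the smoothed estimate at node s from that of its parent."""
    J = P_upd @ F.T @ np.linalg.inv(P_par_pred)              # eq. (13)
    x_s = x_upd + J @ (x_par_smooth - x_par_pred)            # eq. (11)
    P_s = P_upd + J @ (P_par_smooth - P_par_pred) @ J.T      # eq. (12)
    return x_s, P_s, J
```

When the parent's smoothed estimate coincides with its prediction from $Y_s$, the step leaves the node's estimate unchanged; otherwise the error covariance shrinks by $J(s)[P(s\bar{\gamma}|Y_s) - P(s\bar{\gamma}|Y_0)]J^T(s)$.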


Note that (12) characterizes the smoothing error covariance at any given node $s$, but does
not provide information about the correlation structure of the error process. The goal in the next
section is to provide a multiscale model for the smoothing error process, i.e. to show that the error
satisfies a recursion of the form (1), and to calculate the associated model parameters. This then
provides the complete statistical characterization of the smoothing error that we seek.
3  Multiscale  Smoothing  Error  Models


Given two nodes $s$ and $\sigma \in \mathcal{T}_s^c$ on the tree, we can always represent $x(\sigma)$ in terms of $x(s\bar{\gamma})$ and an
additive noise term $\varphi_{\sigma,s\bar{\gamma}}$:

$$x(\sigma) = \Phi_{\sigma,s\bar{\gamma}}\,x(s\bar{\gamma}) + \varphi_{\sigma,s\bar{\gamma}} \qquad (14)$$

by tracing a path from $\sigma$ to $s\bar{\gamma}$ and using the upward dynamics (6) and downward dynamics (1) to
eliminate state variables along the way. The state transition matrix $\Phi_{\sigma,s\bar{\gamma}}$ is a function of the downward
and upward prediction matrices $A$ and $F$ along the path, whereas $\varphi_{\sigma,s\bar{\gamma}}$ is a linear function of the
downward and upward driving noises $w$ and $\bar{w}$. For instance, the state $x(s\alpha_i)$ at the $i$th offspring
of $s$ can be written in terms of the state at the $j$th offspring as:

$$x(s\alpha_i) = [A(s\alpha_i)\,F(s\alpha_j)]\,x(s\alpha_j) + [A(s\alpha_i)\,\bar{w}(s\alpha_j) + B(s\alpha_i)\,w(s\alpha_i)] \qquad (15)$$
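The sibling relation (15) can be checked numerically through its second-order statistics: since the bracketed noise term is independent of $x(s\alpha_j)$, the covariance of the right-hand side must reproduce $P(s\alpha_i)$. The sketch below assumes arbitrary test matrices; the function name `sibling_relation` is hypothetical.

```python
import numpy as np

def sibling_relation(A_i, B_i, A_j, B_j, P_parent):
    """Return (Phi, noise_cov, P_i, P_j) for eq. (15) at two offspring of s."""
    P_i = A_i @ P_parent @ A_i.T + B_i @ B_i.T    # eq. (3) at the i-th offspring
    P_j = A_j @ P_parent @ A_j.T + B_j @ B_j.T    # eq. (3) at the j-th offspring
    F_j = P_parent @ A_j.T @ np.linalg.inv(P_j)   # upward matrix from (6)
    Q_j = P_parent - F_j @ A_j @ P_parent         # upward noise covariance
    Phi = A_i @ F_j                               # transition matrix in (15)
    noise_cov = A_i @ Q_j @ A_i.T + B_i @ B_i.T   # covariance of the noise in (15)
    return Phi, noise_cov, P_i, P_j
```

With these quantities, `Phi @ P_j @ Phi.T + noise_cov` should equal `P_i`, and `Phi @ P_j` should equal the sibling cross-covariance $A(s\alpha_i)P(s)A^T(s\alpha_j)$.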

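By construction, $\varphi_{\sigma,s\bar{\gamma}}$ is independent of the set of states $\{x(s\bar{\gamma})\} \cup X_s$, as well as the corresponding set
of measurements $\{y(s\bar{\gamma})\} \cup Y_s$. This implies that $\hat{x}(\sigma|Y_s) = \Phi_{\sigma,s\bar{\gamma}}\,\hat{x}(s\bar{\gamma}|Y_s)$ which, using (14), implies
that:

$$\tilde{x}(\sigma|Y_s) = \Phi_{\sigma,s\bar{\gamma}}\,\tilde{x}(s\bar{\gamma}|Y_s) + \varphi_{\sigma,s\bar{\gamma}} \qquad (16)$$

where we have defined the error in $\hat{x}(s|Y)$ as $\tilde{x}(s|Y) = x(s) - \hat{x}(s|Y)$. As a result, we see that
$\tilde{x}(s|Y_s)$ has the following Markov property:

$$\begin{aligned}
E\{\tilde{x}(s|Y_s) \mid \tilde{x}(\sigma|Y_s),\ \sigma \in \mathcal{T}_s^c\} &= E\{\tilde{x}(s|Y_s) \mid \tilde{x}(s\bar{\gamma}|Y_s),\ \{\varphi_{\sigma,s\bar{\gamma}},\ \sigma \in \mathcal{T}_s^c\}\} \\
&= E\{\tilde{x}(s|Y_s) \mid \tilde{x}(s\bar{\gamma}|Y_s)\} + E\{\tilde{x}(s|Y_s) \mid \{\varphi_{\sigma,s\bar{\gamma}},\ \sigma \in \mathcal{T}_s^c\}\} \\
&= E\{\tilde{x}(s|Y_s) \mid \tilde{x}(s\bar{\gamma}|Y_s)\} \qquad (17)
\end{aligned}$$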

The first equality in (17) follows from (16), the second from the orthogonality of $\varphi_{\sigma,s\bar{\gamma}}$ to $x(s\bar{\gamma})$ and
$Y_s$, and the last from the orthogonality of $\varphi_{\sigma,s\bar{\gamma}}$ to $x(s)$ and $Y_s$. Now, using the upward dynamics
(6), the upward sweep prediction equation (7) and standard linear least squares formulae we can
write:

$$\tilde{x}(s|Y_s) = J(s)\,\tilde{x}(s\bar{\gamma}|Y_s) + \tilde{w}(s) \qquad (18)$$

where $J(s)$ is given by (13) and where, from (17), $\tilde{w}(s)$ is independent of $\{\tilde{x}(\sigma|Y_s)\}_{\sigma \in \mathcal{T}_s^c}$ and has
covariance:

$$E[\tilde{w}(s)\,\tilde{w}^T(s)] = P(s|Y_s) - P(s|Y_s)\,F^T(s)\,P^{-1}(s\bar{\gamma}|Y_s)\,F(s)\,P(s|Y_s) \qquad (19)$$

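Next, note that the independence of $\tilde{w}(s)$ and $\{\tilde{x}(\sigma|Y_s)\}_{\sigma \in \mathcal{T}_s^c}$ implies that $\tilde{w}(s)$ is also independent
of the residual information about $x(s)$ which is contained in the set of all available measurements
$Y_0$, but not contained in $Y_s$. In particular, at each node in $\mathcal{T}_s^c$ a residual component $\nu_s(\sigma)$ which is
orthogonal to the measurements in the set $Y_s$ can be defined as:

$$\begin{aligned}
\nu_s(\sigma) &= y(\sigma) - E\{y(\sigma) \mid Y_s\} \\
&= C(\sigma)\,\tilde{x}(\sigma|Y_s) + v(\sigma) \qquad (20)
\end{aligned}$$

Denoting $\nu_s = \{\nu_s(\sigma)\}_{\sigma \in \mathcal{T}_s^c}$, it is clear that $\operatorname{span} Y_0 = \operatorname{span}\{Y_s, \nu_s\}$, that $\nu_s \perp Y_s$ and that $\nu_s \perp
\tilde{w}(s)$. Taking the expected value of both sides of (18) conditioned on $\nu_s$, we obtain:

$$E\{\tilde{x}(s|Y_s) \mid \nu_s\} = J(s)\,E\{\tilde{x}(s\bar{\gamma}|Y_s) \mid \nu_s\} \qquad (21)$$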

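Finally, noting that

$$\hat{x}(s|Y_0) = \hat{x}(s|Y_s) + E\{\tilde{x}(s|Y_s) \mid \nu_s\} \qquad (22)$$

and that the same decomposition holds for $\hat{x}(s\bar{\gamma}|Y_0)$, subtracting (21) from (18) results in:

$$\tilde{x}(s|Y_0) = J(s)\,\tilde{x}(s\bar{\gamma}|Y_0) + \tilde{w}(s) \qquad (23)$$

which is a multiscale model for the smoothing error of precisely the same form as (1).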
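A consequence of (23) is that cross-covariances between smoothing errors at different nodes follow from $J(s)$ and the error covariance at the common ancestor. The sketch below checks this on the smallest nontrivial case, a root with two leaves, against direct conditioning of the joint Gaussian. It assumes the same $A$, $B$, $C$ and $R$ at every node; the function name `smoother_vs_brute_force` is hypothetical.

```python
import numpy as np

def smoother_vs_brute_force(A, B, C, R, P0):
    """Root 0 with leaves 1 and 2 (q = 2), shared A, B, C, R at all nodes."""
    n = P0.shape[0]
    I = np.eye(n)
    inv = np.linalg.inv
    P1 = A @ P0 @ A.T + B @ B.T                   # leaf prior covariance, eq. (3)

    # Upward sweep: leaves have no measurements below them, so the
    # pre-update covariance at a leaf is the prior P1.
    K1 = P1 @ C.T @ inv(C @ P1 @ C.T + R)
    P_upd = (I - K1 @ C) @ P1                     # eq. (5)
    F = P0 @ A.T @ inv(P1)                        # upward model, eq. (6)
    Q = P0 - F @ A @ P0
    P_pred = F @ P_upd @ F.T + Q                  # eq. (8), same for both leaves
    P_merge = inv(-inv(P0) + 2 * inv(P_pred))     # eq. (10) with q = 2
    K0 = P_merge @ C.T @ inv(C @ P_merge @ C.T + R)
    P_root = (I - K0 @ C) @ P_merge               # P(0 | Y_0)

    # Downward sweep and the covariances implied by the error model (23).
    J = P_upd @ F.T @ inv(P_pred)                 # eq. (13)
    P_leaf = P_upd + J @ (P_root - P_pred) @ J.T  # eq. (12)
    model = {"leaf": P_leaf,
             "leaf_root": J @ P_root,
             "leaf_leaf": J @ P_root @ J.T}

    # Brute force: condition the joint Gaussian of (x0, x1, x2) on all data.
    S01 = P0 @ A.T                                # Cov(x0, x1) = Cov(x0, x2)
    S12 = A @ P0 @ A.T                            # Cov(x1, x2)
    Sigma = np.block([[P0, S01, S01],
                      [S01.T, P1, S12],
                      [S01.T, S12.T, P1]])
    H = np.kron(np.eye(3), C)                     # block-diagonal measurement matrix
    Rbig = np.kron(np.eye(3), R)
    G = Sigma @ H.T @ inv(H @ Sigma @ H.T + Rbig)
    post = Sigma - G @ H @ Sigma                  # joint smoothing error covariance
    brute = {"leaf": post[n:2 * n, n:2 * n],
             "leaf_root": post[n:2 * n, :n],
             "leaf_leaf": post[n:2 * n, 2 * n:]}
    return model, brute
```

For the scalar case with all parameters equal to one, both computations give a leaf error variance of $7/12$, a leaf-root error covariance of $1/6$, and a leaf-leaf error covariance of $1/12$.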

This model is, of course, consistent with the error covariance computation in (12). In particular,
using the Lyapunov equation for (23) we obtain:

$$\begin{aligned}
P(s|Y_0) &= J(s)\,P(s\bar{\gamma}|Y_0)\,J^T(s) + P(s|Y_s) - P(s|Y_s)\,F^T(s)\,P^{-1}(s\bar{\gamma}|Y_s)\,F(s)\,P(s|Y_s) \\
&= P(s|Y_s) + J(s)\,[P(s\bar{\gamma}|Y_0) - P(s\bar{\gamma}|Y_s)]\,J^T(s) \qquad (24)
\end{aligned}$$
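The equality of the two lines of (24) rests on the identity $J(s)P(s\bar{\gamma}|Y_s)J^T(s) = P(s|Y_s)F^T(s)P^{-1}(s\bar{\gamma}|Y_s)F(s)P(s|Y_s)$, which can be confirmed numerically. In this sketch the function name `error_model_covariance` is hypothetical and the inputs are arbitrary symmetric test matrices rather than quantities from an actual smoother run.

```python
import numpy as np

def error_model_covariance(P_upd, P_par_pred, P_par_smooth, F):
    """Evaluate both lines of (24) and the noise covariance (19)."""
    J = P_upd @ F.T @ np.linalg.inv(P_par_pred)            # eq. (13)
    W = P_upd - J @ F @ P_upd                              # eq. (19)
    lyapunov = J @ P_par_smooth @ J.T + W                  # first line of (24)
    rts = P_upd + J @ (P_par_smooth - P_par_pred) @ J.T    # eq. (12)
    return lyapunov, rts, W
```

When `P_par_pred` is built as in (8), i.e. as $F P F^T + Q$ with $Q$ positive definite, the noise covariance `W` is also positive definite, as required of a driving-noise covariance in a model of the form (1).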


In addition, on first-order trees, the model (1) reduces to a standard Gauss-Markov model, and
hence (23) generalizes to $q$th-order trees the corresponding 1-D time-series result. The derivation
here is related to, but is in fact substantially simpler than, the derivation based on backwards
prediction error models in [8].
Figure 1: Multiscale stochastic processes are indexed by the $q$th-order tree. The parent of a node
$s$ on the tree is denoted $s\bar{\gamma}$, and its $q$ offspring are denoted $s\alpha_1, \ldots, s\alpha_q$.